ABSTRACT

Motivation: Event extraction using expressive structured 
representations has been a significant focus of recent efforts 
in biomedical information extraction. However, event extraction 
resources and methods have so far focused almost exclusively on 
molecular-level entities and processes, limiting their applicability. 
Results: We extend the event extraction approach to biomedical 
information extraction to encompass all levels of biological 
organization from the molecular to the whole organism. We present 
the ontological foundations, target types and guidelines for entity and 
event annotation and introduce the new multi-level event extraction 
(MLEE) corpus, manually annotated using a structured representation 
for event extraction. We further adapt and evaluate named entity and 
event extraction methods for the new task, demonstrating that both 
can be achieved with performance broadly comparable with that for 
established molecular entity and event extraction tasks. 
Availability: The resources and methods introduced in this study are 
available from http://nactem.ac.uk/MLEE/.
Contact: pyysalos@cs.man.ac.uk

Supplementary information: Supplementary data are available at 
Bioinformatics online. 



1 INTRODUCTION 

A detailed understanding of biological systems requires the ability 
to trace cause and effect across multiple levels of biological 
organization, from molecular-level reactions to cellular, tissue- and
organ-level effects to organism-level outcomes (Kitano, 2002).
Consequently, any effort aiming to comprehensively represent 
biological systems must address entities and processes at all of these 
levels. 

This challenge has so far been only partially met in biomedical 
information extraction (IE) and text mining, which aim to improve 
access to domain knowledge by automating aspects of processing the 
literature. Until recently, efforts in domain IE were primarily focused 
on the basic task of recognizing mentions of relevant entities such as
genes and proteins in text (Yeh et al., 2005) and on the extraction of
pairwise relations between these representing, for example, protein-protein
interactions (Krallinger et al., 2007; Nédellec, 2005). Such
representations lack the capacity to capture any but the simplest of 
associations. 

In recent years, there has been increasing interest in the extraction 
of structured representations capable of capturing associations of 
arbitrary numbers of participants in specific roles. Such approaches 
to IE, frequently termed event extraction, are capable of representing
complex associations, such as the binding of a protein to another
inhibiting its localization to a specific cellular compartment
(Fig. 1), and open many new opportunities for domain text mining
applications ranging from semantic search to database and pathway
curation support (Ananiadou et al., 2010). There is significant
momentum behind the move to richer representations for IE: more
than 30 groups have introduced methods for biomedical event
extraction in shared tasks (Kim et al., 2011a, b); event-annotated
corpora have been introduced for many extraction targets, including
DNA methylation (Ohta et al., 2011a), protein modifications
(Pyysalo et al., 2011) and the molecular mechanisms of infectious
diseases (Pyysalo et al., 2012c); event extraction methods have been
applied to automatically analyze all 20 million PubMed abstracts
(Björne et al., 2010); and event extraction analyses are being
integrated into literature search systems such as MEDIE^1 and applied
in support of advanced tasks such as pathway curation (Ohta et al.,
2011b).

While the event extraction approach has been demonstrated to 
be applicable to a variety of extraction targets across different 
subdomains of biomedical science, related efforts all share a key 
restriction: nearly exclusive focus on molecular-level entities and 
events.^2 Entities such as proteins and genes and events such as
binding and phosphorylation are an important part of the picture 
of biological systems, but still only a part, and any IE approach 
aiming to capture the whole picture must also consider other levels 
of biological organization. 

In this study, our aim is to extend the scope of existing event 
extraction resources and methods to levels of biological organization 
ranging from the subcellular to the organism level as a step toward 
developing the capacity for the automatic extraction of these targets 
from the entire available literature. Toward this end, we propose 
relevant entity and event types for annotation across these levels 
with reference to community-standard ontologies, develop a set of
detailed guidelines for their annotation in text and create structured 
event annotation marking over 8000 entities and 6000 events in 
abstracts relevant to cancer biology, previously annotated by domain 
experts to identify spans of text relevant to their interests. Using 
this data, we perform experiments using state-of-the-art methods 
for both entity mention detection and event extraction to analyze 



^1 http://www.nactem.ac.uk/medie/.

^2 Some recent tasks have also considered organisms (primarily unicellular;
see e.g. Bossy et al., 2012; Pyysalo et al., 2012c), but prior event extraction
efforts have not specifically targeted entities and processes between the
molecular and organism levels.
Fig. 1. Example sentence with event annotation (figure graphic not
reproduced; recoverable sentence fragments: 'MAD-3 to', 'inhibits p65',
'localization to the nucleus'). Prot, -Reg and Cell comp. are abbreviations
for Protein, Negative regulation and Cell component, respectively.



the feasibility of extraction using existing tools, further evaluating 
the benefits of specific adaptations of such tools to the novel task. 



2 APPROACH 

2.1 Corpus texts and reference annotation 

We selected as the starting point for our study a recently introduced 
corpus of 262 PubMed abstracts on angiogenesis, the development 
of new blood vessels from existing ones. The domain involves 
a tissue/organ-level process that is closely associated with cancer 
and other organism-level pathologies and whose molecular basis
is increasingly understood (Carmeliet and Jain, 2000), and domain
texts thus represent a good test case for structured IE across multiple 
levels of biological organization. 

The corpus texts were previously annotated by Wang et al. (2011)
using a typed-span representation, marking references to molecular-level
entities, cells, tissues and domain-relevant processes. We
use these annotations created by domain experts as a reference 
for identifying statements of interest for our annotation, which 
focuses on introducing structured event annotation and solidifying 
the ontological basis of the existing entity annotation. 

2.2 Representation 

We apply the specific event representation first formalized in 
the BioNLP 2009 Shared Task on event extraction and applied 
in numerous resources and methods introduced since. In this 
representation, entity mentions (or entities, for short) are marked as
continuous spans of text identified with a type (e.g. Protein), and
event structures (or events) are n-ary associations of participants
(entities or other events), each of which is identified as participating
in the event in a specific role (e.g. Theme and Cause). Each event is
assigned a type from a fixed set defined for the task (e.g. Binding 
and Phosphorylation) and is associated with a specific span of text 
stating the event, termed the event trigger. Events can additionally be 
marked with modifiers identifying the event as being, e.g. explicitly
negated or stated in a speculative context. We refer to Kim et al.
(2011a) for a detailed presentation of the representation.
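
As a minimal sketch of this representation (not code from the BioNLP Shared Task or from this study), the following Python data structures model typed spans and n-ary events with roles and modifiers. The class names, the helper `mention`, the identifiers (T1, E1, ...) and the simplified example sentence are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Span:
    """A typed, continuous span of text: an entity mention or an event trigger."""
    id: str
    type: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    text: str


@dataclass
class Event:
    """An n-ary association of participants, anchored to a trigger span."""
    id: str
    type: str
    trigger: Span
    # (role, participant) pairs; a participant is an entity span or another event
    arguments: List[Tuple[str, Union[Span, "Event"]]] = field(default_factory=list)
    negated: bool = False      # Negation modifier
    speculated: bool = False   # Speculation modifier

    def participants(self, role: str) -> list:
        """Return all participants filling the given role."""
        return [p for r, p in self.arguments if r == role]


def mention(id_: str, type_: str, text: str, surface: str) -> Span:
    """Hypothetical helper: locate the first occurrence of `surface` in `text`."""
    start = text.index(surface)
    return Span(id_, type_, start, start + len(surface), surface)


# Simplified, illustrative sentence (an assumption, not quoted from the corpus).
sentence = "MAD-3 inhibits p65 localization to the nucleus"
mad3 = mention("T1", "Protein", sentence, "MAD-3")
p65 = mention("T2", "Protein", sentence, "p65")
nucleus = mention("T3", "Cellular component", sentence, "nucleus")

localization = Event("E1", "Localization",
                     mention("T4", "Localization", sentence, "localization"),
                     [("Theme", p65), ("To-Loc", nucleus)])
# An event can itself be a participant: here it is the Theme of the regulation.
inhibition = Event("E2", "Negative regulation",
                   mention("T5", "Negative regulation", sentence, "inhibits"),
                   [("Theme", localization), ("Cause", mad3)])
```

Because participants may be events, the nesting of `localization` inside `inhibition` expresses what a pairwise relation could not: which entity causes the regulation and which process undergoes it.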

Given the starting point of the existing corpus annotations, 
our event annotation effort proceeds from spans to a structured 
representation that can represent complex associations between 
arbitrary numbers of entities (Fig. 1) and many other aspects that the
typed-span representation cannot, such as the direction of causality
(Fig. 2).

In addition to selecting the general form of representation, to 
define a specific event annotation scheme, we must also fix the 
annotated entity and event types as well as the roles, participant 
scopes and modifiers applied. For these, we build on previously 
introduced resources targeting the molecular level, basing our 
extensions on domain ontologies. 



Fig. 2. Span versus structure (figure graphic not reproduced; both panels
annotate the text 'Protein-2 is regulated by Protein-1'). Although a
representation using nested, typed spans (left) can capture the fact that
specific entities participate in a process, it lacks the mechanisms to
express, e.g. the direction of causality. The structured event
representation (right) differentiates Themes from Causes.

2.3 Ontological basis 

We take as basic the division between continuants (or endurants) 
and occurrents (perdurants, processes or events) (see e.g.
Smith, 2003) and adopt the general principle followed also in major
previously introduced event-annotated resources that references
to continuants such as material entities are annotated using the
entity representation and references to occurrents such as biological
processes are annotated as events.^3

In the definition of our annotation scheme, we aim for
compatibility with existing event-annotated corpora (primarily the
five 'main task' corpora introduced in the BioNLP Shared Tasks)
to allow these to be used together with the annotations that we
create and to ensure that our extensions are coherent with existing
resources derived from these corpora. Thus, for molecular-level
entity and process types, we adopt the scope, semantics and 
annotation guidelines of these resources as closely as possible 
without compromising coverage of mentions marked as relevant 
by domain experts. For entities and processes not in scope of 
previous event resources, we propose new types for annotation, 
basing type and scope definitions and annotation guidelines on major 
community-curated ontological resources from the Open Biomedical
Ontologies (OBO) Foundry^4 (Smith et al., 2007). In brief, before
primary annotation, we analyzed mentions marked in the reference 
annotation to identify entity and process types not in scope of 
previously defined event annotation guidelines and then defined 
new types and guidelines for annotation with reference to selected 
ontologies. These are summarized in the following. 

2.4 Annotation scheme 

The focus of our extensions of previously proposed event annotation
schemes is on anatomical entities such as cells, tissues and organs
and processes involving them such as growth, remodeling and
death.^5

For anatomical entity types, we adopt a top-level division by
granularity (Kumar et al., 2004) based primarily on the upper-level
structure of the Common Anatomy Reference Ontology
(CARO) (Haendel et al., 2008), an organism-independent ontology
of anatomy based on the human-specific Foundational Model of
Anatomy (Rosse and Mejino, 2003, 2008), as outlined in our
previous work on anatomical entities (Pyysalo et al., 2012b). To
account for pathological anatomy-level entities (e.g. glioma), which are
out of scope of ontologies of canonical anatomy, we draw on the approach
proposed by Smith et al. (2005). Table 1 summarizes the primary
entity types applied in the annotation.^6

For event types, we draw primarily on the biological
process subontology of the Gene Ontology (GO) (Ashburner et al., 2000).
As in previous event-annotated resources, we consider
only general upper-level GO terms such as growth_GO; references
to specific processes included in GO through composite terms
such as regulation of heart growth_GO are captured using
the explicitly structured representation^7 (Fig. 3). We also capture
general statements of causal association using Regulation types, as
in previous event annotation efforts (see e.g. Kim et al., 2008).
Following the scope of the reference annotation, we introduce
event annotation also for intentionally planned processes (e.g.
injection) as outlined in the Ontology for Biomedical Investigations
(OBI) (Brinkman et al., 2010), using a single, non-specific type
Planned process for their annotation. We additionally introduce
a Breakdown event for annotating pathological processes that
result in the breakdown of anatomical structures. Finally, we apply
the domain-specific Blood vessel development type to annotate
references to blood vessel development through expressions such
as 'angiogenesis' that incorporate both the process and the affected
entity. Expressions such as 'blood vessel development' that allow
explicitly structured annotation are marked with a separate entity
annotation (e.g. 'blood vessel') and an event (e.g. 'development')
taking the entity as its Theme. The primary event types are
summarized in Table 2.

For event participants, we apply otherwise standard roles included
also in previous efforts (e.g. Theme and Cause) but introduce the
role Instrument for distinguishing entities used to carry out planned
processes from those that undergo the effects of the process.^8 Also
as in previously introduced event corpora, we apply two binary
modifiers, Negation and Speculation, marking events as explicitly
negated (e.g. 'cells did not proliferate') or stated in a speculative
context (e.g. 'growth might be inhibited'), respectively.

We refer to the detailed annotation guidelines (Pyysalo et al.,
2012a) for specifics of the annotation, but note here one
systematic difference between our annotation and the scope of
the reference ontologies: the ontologies define idealized types
(canonical anatomy and physiological processes), but texts primarily
refer to real-world instances that do not meet these exacting criteria
(Bada and Hunter, 2011). We thus interpreted the scope of mentions
marked with a specific type to include not only the corresponding
(canonical) types defined in ontologies but also variants such
as entities or processes influenced by mutation, including also
pathological variants. As specific examples, we mark 'cancer cell'
as Cell and '[cancer] growth' as Growth.

^3 We use the terms 'entity' and 'event' primarily following usage in IE, to
identify forms of representation, not ontological categories. In particular, the
latter term does not denote a category distinct from processes.

^4 http://obofoundry.org.

^5 Although the existing corpus annotation of Wang et al. (2011) identifies
such mentions, they are typed nonspecifically, using e.g. Positive
regulation to mark 'development' and Negative regulation for '[cell]
death'.

^6 Note that we differentiate between types applied in annotation and their
(broadly) corresponding ontology types.

^7 This annotation strategy can be viewed as partly analogous to efforts to
make GO term structure explicit (Mungall et al., 2011).

^8 For example, in 'rats were injected with hyperforin', the Organism mention
('rats') is the Theme of the Planned process ('injected') and the Drug or
compound mention ('hyperforin') is the Instrument.

Table 1. Primary entity types, related ontology terms and annotation counts

Type                               Term(s)                                          Examples                                           Count
Organism^a                         Single cell org._CARO, multi-cellular org._CARO  Human, mice, C. albicans                             722
Anatomy
  Organism subdivision             Organism subdivision_CARO                        Head, thorax, hindlimb, legs                          49
  Anatomical system                Anatomical system_CARO                           Central nervous system, pulmonary system              18
  Organ                            Compound organ_CARO                              Heart, eyes, skin                                    176
  Multi-tissue structure           Multi-tissue structure_CARO                      Blood vessel, peritoneal membrane, lymph nodes       514
  Tissue                           Portion of tissue_CARO                           Endothelium, adipose tissue, capillary               426
  Cell                             Cell_CL                                          Endothelial cells, HUVECs, pericyte, cancer cells   1198
  Cellular component               Cellular component_GO                            Nuclei, focal adhesions, extracellular matrix        145
  Developing anatomical structure  Developing anatomical structure_EHDAA            Embryo                                                 6
  Organism substance               Portion of organism substance_CARO               Blood, serum, plasma, urine                          142
  Immaterial anatomical entity     Immaterial anatomical entity_CARO                Lumen, preperitoneal space, marrow cavity             15
  Pathological formation           Cancer_DOID, benign neoplasm_DOID                Tumor, colorectal cancer, gliomas                    910
Molecule
  Drug or compound^a               Inorganic molecular entity_ChEBI, drug_ChEBI     Oxygen, ethanol, bevacizumab, thalidomide            944
  Gene or gene product^a           Gene_SO, RNA_ChEBI, protein_ChEBI                VEGF, p53, IL-8, endostatin, thrombin               2962

The category labels Anatomy and Molecule (shown in gray in the original layout) identify informal categories used in evaluation.
^a Annotated also in previously introduced event extraction resources. t_o identifies a term t in an ontology o; ontology identifiers are OBO Foundry prefixes (namespaces).

Fig. 3. Annotation with detailed GO terms (top; hypothetical) and event
annotation with general types (bottom; applied). Figure graphic not
reproduced; both panels annotate the sentence 'Reptin regulates the growth
of the heart.'

2.5 Annotation process

Primary annotation was performed by a PhD biologist with more
than a decade of experience in text annotation who had previously
coordinated several event annotation efforts (TO). Annotations were
made using the brat rapid annotation tool (Stenetorp et al., 2012).
Table 2. Primary event types, argument roles, related ontology terms and annotation counts

Type                        Arguments              Term(s)                             Examples                                        Count
Anatomical
  Cell proliferation        Theme                  Cell proliferation_GO               proliferating [ECs], [MCs] accumulated            133
  Development               Theme                  Developmental process_GO            [skin] development, [stress fiber] formation      316
  Blood vessel development  Theme, At-Loc          Blood vessel development_GO         angiogenesis, neovascularization                  855
  Growth                    Theme                  Growth_GO                           growth [of arteries], [tumour] growth             169
  Death                     Theme                  Death_GO                            [connective tissue] necrosis, [cell] apoptosis     97
  Breakdown                 Theme                  -                                   [ECM] degradation, damage [to tumor cell]          69
  Remodeling                Theme                  Tissue remodeling_GO                [vascular] remodeling, changes [in membrane]       33
Molecular
  Synthesis                 Theme                  Biosynthetic process_GO             [ATP] synthesis, production [of NOS]               17
  Gene expression^a         Theme                  Gene expression_GO                  expression [of VEGF]                              435
  Transcription^a           Theme                  Transcription, DNA-dependent_GO     [VEGF] mRNA expression                             37
  Catabolism^a              Theme                  Catabolic process_GO                [p53] breakdown                                    26
  Phosphorylation^a         Theme, Site            Phosphorylation_GO                  phosphorylation [of KDR]                           33
  Dephosphorylation^a       Theme, Site            Dephosphorylation_GO                [Mcl-1] dephosphorylation                           6
General
  Localization^a            Theme, At/From/To-Loc  Localization_GO                     [VEGF] colocalized, [VPF was] secreted            450
  Binding^a                 Theme, Site            Binding_GO, biological adhesion_GO  [cell] adhesion, [GDP-]bound [Rab5a]              184
  Regulation^a              Theme, Cause, Site     Biological regulation_GO            [aMSH] modulates [activation of AP-1]             113
  Positive regulation^a     Theme, Cause, Site     Pos. regulation of biol. proc._GO   [insulin] stimulates [VEGF expression]           1327
  Negative regulation^a     Theme, Cause, Site     Neg. regulation of biol. proc._GO   Inhibition [of NO synthase by L-NAME]             921
Planned
  Planned process           Theme, Instrument      Planned process_OBI                 injection [of U-995], [UFT] administration        643

The category labels Anatomical, Molecular, General and Planned (shown in gray in the original layout) identify categories used in evaluation: events of the Anatomical category involve Organism or Anatomy entities (Table 1); Molecular involve Molecule entities;
others can involve any entity type.

^a Annotated also in previously introduced event extraction resources.



Detailed annotation guidelines were prepared based on the GENIA
and BioNLP Shared Task guidelines and refined
throughout annotation to clarify ambiguous cases and document
specific decisions made in annotation. We refer to the supplementary
documentation and these guidelines (Pyysalo et al., 2012a) for
further details of the annotation scheme and the detailed definitions
of all annotated types.



3 METHODS 

This section presents the automatic entity mention detection and event 
extraction methods applied in this study, their adaptation to the novel 
extraction targets and the experimental setup. 

Following standard practice in domain event extraction studies, we divide 
the automatic extraction task into two separate stages, the detection of entity 
mentions and the extraction of events involving these and evaluate system 
performance on these two separately. 

3.1 Entity mention detection 

For entity mention detection experiments, we applied NERsuite, a named
entity recognition toolkit based on the CRFsuite implementation (Okazaki,
2007) of conditional random fields (Lafferty et al., 2001). NERsuite is
capable of efficiently incorporating features based on token matching against
large-scale lexical resources, and the applied version achieves an F score of
86.4% on the BioCreative II evaluation standard (GENETAG) (Tanabe et al.,
2005), effectively matching the performance of the best available systems
for the task.^9



Following initial sentence splitting and tokenization, we perform
lemmatization, POS-tagging and shallow parsing using the GENIA tagger
(Tsuruoka and Tsujii, 2005). Next, we optionally perform a matching step
using dictionaries compiled from the UMLS Metathesaurus (Bodenreider,
2004), Entrez Gene (Maglott et al., 2005) and OBO Foundry (Smith et al.,
2007) resources. We then extract a comprehensive set of features for machine
learning, building on orthographic, lexical, syntactic and dictionary match
information (see Supplementary information).

Following preliminary development test experiments, we chose to apply
a single model that jointly predicts all entity types. In the final experiments,
we compare a base model using only features derived from the newly annotated
data without external resources with a dictionary-supported model that
incorporates features from matching against the lexical resources derived
from UMLS, Entrez Gene and OBO Foundry ontologies.
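
To illustrate what a dictionary match feature can look like, the following Python sketch tags each token with the class of the longest lexicon entry covering it. This is an assumption-laden illustration, not NERsuite's implementation: the function name, the greedy longest-match strategy, the B-/I-/O feature encoding and the toy lexicon are all hypothetical.

```python
from typing import Dict, List, Tuple


def dictionary_features(tokens: List[str],
                        lexicon: Dict[Tuple[str, ...], str],
                        max_len: int = 5) -> List[str]:
    """Return one feature per token: B-/I-<class> for the longest
    lexicon match starting at or covering the token, otherwise O."""
    feats = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]  # case-insensitive matching
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate first so multi-token entries win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(lowered[i:i + n])
            if key in lexicon:
                cls = lexicon[key]
                feats[i] = "B-" + cls
                for j in range(i + 1, i + n):
                    feats[j] = "I-" + cls
                matched = n
                break
        i += matched or 1  # skip past a match, else advance one token
    return feats


# Toy lexicon (hypothetical entries and class labels).
lexicon = {
    ("blood", "vessel"): "Anatomy",
    ("vessel",): "Anatomy",
    ("vegf",): "Molecule",
}
tokens = ["VEGF", "promotes", "blood", "vessel", "growth"]
features = dictionary_features(tokens, lexicon)
```

Such per-token features would be supplied to the sequence labeller alongside orthographic, lexical and syntactic features; the model remains free to disagree with the dictionary where the annotated data warrant it.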

3.2 Event extraction 

For event extraction, we applied EventMine,^10 a pipeline-based event
extraction system using support vector machines (SVM). EventMine takes
as input document text and entity annotations, and extracts event structures
and modifications. EventMine outperforms the best systems participating in
the original BioNLP Shared Task 2011 on the GE and ID data sets (with
F scores 58.0% and 57.6%, respectively) and is competitive with the best
systems on the EPI data set (Kim et al., 2011b; Miwa et al., 2012).

EventMine consists of four modules: (i) event trigger detection marks
likely triggers and assigns them types, (ii) argument detection identifies
likely trigger-argument pairs and assigns them roles, (iii) multi-argument
event detection combines trigger-argument pairs into likely event structures
and (iv) modification detection assigns modification flags (Negation and
Speculation). Each module addresses its task as a multi-label classification
problem, using the one-versus-rest SVM implementation of Fan et al.
(2008), with a rich feature set generated from tokens and paths in the
predicate-argument structure analyses of the Enju parser (Miyao et al.,
2009) and the dependency analyses of the GDep parser (Sagae and
Tsujii, 2007). In feature generation, EventMine applies semantic class
generalization (e.g. merging Positive regulation and Regulation types
for some features) to reduce the data sparsity and the number of different
classes in the classification problems. In addition to training EventMine on
the newly introduced corpus, we also introduced a set of generalization
rules appropriate to the introduced types. We refer to supplementary
documentation and Miwa et al. (2012) for further details on EventMine.

^9 http://nersuite.nlplab.org

^10 http://www.nactem.ac.uk/EventMine/

We performed event extraction experiments in two settings: training only
on the newly introduced data (base model) and training using stacking,
incorporating predictions from a model trained on the BioNLP Shared Task
2011 GE data set (Kim et al., 2011b) as the source corpus. No other external
resources were used in the evaluation.
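
The general idea of stacking can be sketched as follows: the scores that a source-corpus model assigns to an instance are appended to that instance's feature vector before the target-corpus model is trained. The Python below is a generic illustration under stated assumptions, not EventMine's implementation; the function names, the `src:` prefix, the feature names and the toy source model are all hypothetical.

```python
from typing import Callable, Dict

Features = Dict[str, float]
Scorer = Callable[[Features], Dict[str, float]]


def stack_features(base: Features, source_model: Scorer,
                   prefix: str = "src:") -> Features:
    """Return a copy of `base` extended with the source model's label scores."""
    stacked = dict(base)  # leave the caller's feature dict untouched
    for label, score in source_model(base).items():
        stacked[prefix + label] = score
    return stacked


def toy_source_model(feats: Features) -> Dict[str, float]:
    """Hypothetical stand-in for a classifier trained on the source (GE) corpus."""
    score = 1.0 if feats.get("lemma=stimulate", 0.0) > 0 else -1.0
    return {"Positive_regulation": score}


base = {"lemma=stimulate": 1.0, "pos=VBZ": 1.0}
stacked = stack_features(base, toy_source_model)
# `stacked` holds the original features plus "src:Positive_regulation";
# a target-corpus classifier would then be trained on such extended vectors.
```

Under this scheme, the target model can learn how far to trust the source model for each type, which is useful when the two corpora share only part of their type inventory.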

3.3 Experimental setup 

The annotated data were initially divided into training, development and test 
sets. The test set was held out during method development and parameter 
selection. For the final experiment, methods were trained on the combination 
of training and development data and evaluated on the test set. 

We evaluate both entity mention detection and event extraction
performance using the standard precision, recall and F score^11 metrics,
microaveraged over instance-level true-positive, false-positive and
false-negative counts.
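
A minimal sketch of microaveraged evaluation, assuming the standard balanced F score (the harmonic mean of precision and recall): counts are summed over all instances before the metrics are computed, so frequent types weigh more than rare ones. The function names and the example counts are illustrative assumptions, not values from this study.

```python
from typing import Iterable, Tuple


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall and balanced F score from raw counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro_average(counts: Iterable[Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Sum (tp, fp, fn) triples, e.g. one per type, then compute the metrics once."""
    tp = fp = fn = 0
    for c_tp, c_fp, c_fn in counts:
        tp += c_tp
        fp += c_fp
        fn += c_fn
    return prf(tp, fp, fn)


# Hypothetical per-type counts: a frequent type and a rare one.
per_type = [(8, 2, 2), (1, 1, 3)]
precision, recall, f_score = micro_average(per_type)
```

With these counts the pooled totals are 9 true positives, 3 false positives and 5 false negatives, so the frequent type dominates the result; a macroaverage over the two types would give the rare type equal weight.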

For entity mention detection, we apply the evaluation protocol and tools of
the BioNLP/JNLPBA shared task 2004 (Kim et al., 2004), evaluating results
using three matching criteria: exact span match, left boundary match and right 
boundary match. The first requires the extent of a predicted entity mention 
to be identical to that of a gold mention for the prediction to be considered 
correct, whereas the latter two only require one of the boundaries defining the 
extent to match. We require the type of the predicted and annotated entities 
to be identical in all cases. 
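
The three matching criteria can be sketched as a single predicate over (start, end, type) triples. This is an illustrative reading of the criteria as described above, not the shared task evaluation tool; the function name, the tuple layout and the example offsets are assumptions.

```python
from typing import Tuple

Mention = Tuple[int, int, str]  # (start offset, end offset, entity type)


def entity_match(pred: Mention, gold: Mention, criterion: str = "exact") -> bool:
    """True if `pred` matches `gold` under the given criterion.
    The entity types must be identical under every criterion."""
    p_start, p_end, p_type = pred
    g_start, g_end, g_type = gold
    if p_type != g_type:
        return False
    if criterion == "exact":
        return p_start == g_start and p_end == g_end
    if criterion == "left":
        return p_start == g_start
    if criterion == "right":
        return p_end == g_end
    raise ValueError("unknown criterion: " + criterion)


# Hypothetical case: gold span 'human endothelial cells', predicted span
# 'endothelial cells' (the premodifier 'human' is missed).
gold = (10, 33, "Cell")
pred = (16, 33, "Cell")
results = {c: entity_match(pred, gold, c) for c in ("exact", "left", "right")}
```

In this example only the right boundary criterion accepts the prediction, which is the typical pattern when a system disagrees with the annotation about which premodifiers belong to the mention.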

For event extraction, we adapt the evaluation protocol and tools introduced
in the BioNLP Shared Task 2011 (Kim et al., 2011a), including providing
gold entity annotations as given for event extraction. We apply the primary
matching criteria defined in the task, which otherwise require event structures
to be identical but include the approximate span and approximate recursive
relaxations to exact match: the former allows small variation in predicted
event trigger spans and the latter permits differences in the secondary
arguments of recursive event structures for matches. For detailed definitions,
we refer to Kim et al. (2011a).



4 RESULTS AND DISCUSSION 

We next present the primary results of the annotation effort and the 
entity mention detection and event extraction experiments. 

4.1 Annotation effort and results 

We estimate the concentrated effort to produce the corpus annotation
to have totalled approximately 250 hours, of which approximately
100 hours were used on guideline development, management and
annotation consistency checking. The effort required to produce
structured event annotation is thus broadly comparable to the initial
effort by domain experts to mark text spans of interest (Wang et al.,
2011).

Table 3 presents the overall statistics of the annotated multi-level
event extraction (MLEE) corpus. We note that the texts

^11 Specifically, $F_1 = \frac{2pr}{p + r}$, where p is precision and r recall.



Table 3. Overall corpus statistics

Item            Train   Devel    Test    Total
Document          131      44      87      262
Sentence         1271     457     880     2608
Word           27 875    9610  19 103   56 588
Entity           4147    1431    2713     8291
  Organism        359     126     237      722
  Anatomy        1844     589    1166     3599
  Molecule       1944     716    1310     3970
Event            3296    1175    2206     6677
  Anatomical      810     269     596     1675
  Molecular       340     125     240      705
  General        1851     627    1176     3654
  Planned         295     154     194      643

See Tables 1 and 2 for entity and event categories.






Table 4. Comparison of corpus statistics with BioNLP Shared Task 2011
corpora annotated using the same representation

Item           MLEE       EPI        GE        ID
Document        262      1200      1224      30^a
Word         56 588   253 628   348 908   153 153
Entity         8291    15 190    21 616    12 740
Event          6677      3714    24 967      4150

^a The ID document count is low as the corpus consists of full-text documents, not
abstracts.



Fig. 4. Example Negative regulation (-Reg) event connecting entities at
different levels of biological organization (figure graphic not reproduced;
recoverable sentence text: 'Troponin I inhibits EC proliferation by
interaction with the bFGF receptor', with Cell proliferation and Binding
events marked).



include comparable numbers of molecular and anatomy-level entity 
mentions, with a lower but still notable number of organism 
mentions. The event counts show a higher density of anatomical than 
molecular-level events, although general biological events dominate 
overall. Overall, 1222 events, or 18% of the total, involve either 
directly or indirectly (through participating events) arguments at 
both the molecular and anatomy levels (Fig. 4). Table 4 presents
corpus statistics with reference to those for the three largest event- 
annotated corpora in the recent BioNLP shared task 2011. We 
note that although the MLEE corpus is smaller than these corpora 
focusing on the molecular level in terms of e.g. word count, there is 
less difference in the number of entity annotations, and the MLEE 
corpus has more event annotations than two of the shared task 
corpora. The introduced corpus thus has a very high density of event 
annotations, which we attribute in part to the novel entity and event 
types allowing a more comprehensive representation of statements 
in text. 
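
The density comparison can be made concrete with a short calculation over the word and event counts reported in Table 4. The Python below is only a worked-arithmetic sketch; the variable names and the choice of events per 1000 words as the unit are assumptions.

```python
# (word count, event count) per corpus, as reported in Table 4.
corpora = {
    "MLEE": (56588, 6677),
    "EPI": (253628, 3714),
    "GE": (348908, 24967),
    "ID": (153153, 4150),
}

# Event annotations per 1000 words of text.
density = {name: 1000 * events / words
           for name, (words, events) in corpora.items()}

for name in sorted(density, key=density.get, reverse=True):
    print(f"{name}: {density[name]:.1f} events per 1000 words")
```

By this measure MLEE has on the order of 118 events per 1000 words, compared with roughly 72 for GE, the densest of the three shared task corpora.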

We refer to Supplementary Material Section 1.3 for an evaluation 
of the corpus annotation consistency. 
Table 5. Overall entity mention detection results (prec/rec/F score)

            Matching criterion
Model       Exact              Left boundary      Right boundary
Base        77.03/69.18/72.89  79.85/71.72/75.57  82.47/74.07/78.04
Dictionary  79.49/73.77/76.52  82.59/76.64/79.50  84.68/78.58/81.52

Table 6. Entity mention detection results by category for dictionary model
(prec/rec/F score)

            Matching criterion
Category    Exact              Left boundary      Right boundary
Organism    90.82/82.10/86.24  91.79/82.97/87.16  91.79/82.97/87.16
Anatomy     77.47/72.70/75.01  78.67/73.83/76.17  84.58/79.38/81.90
Molecule    79.37/73.25/76.18  84.54/78.03/81.15  83.54/77.10/80.19


4.2 Entity mention detection 

The overall evaluation results for entity mention detection are
listed in Table 5. We find a consistent benefit from the use of the
lexical resources, with e.g. a 3.6 percentage point improvement in F score
(15% reduction in error) for strict matching. As expected, evaluated
performance is notably higher under the relaxed criteria, in particular 
for right boundary matching. This suggests comparatively many 
errors in the choice of noun premodifiers included in annotation 
span, a distinction that may not be of critical importance for many 
applications. 

Table 6 lists a breakdown of performance by entity category for
the dictionary model. The detection of Organism mentions is most
reliable despite their sparseness in the data, conforming to previous
results indicating this entity class to represent a comparatively easy
problem (Gerner et al., 2010). The detection of mentions of entities
of the Anatomy and Molecule categories can be performed at
broadly comparable accuracy on this corpus containing balanced
numbers of annotations of the two, suggesting that fine-grained
anatomical entity detection is no more difficult than established
molecular-level entity detection tasks.

The overall entity mention detection performance, approaching 
or exceeding 80% in F score depending on evaluation criteria, is 
a very promising result given the novelty of the task and its many 
challenging aspects, most obviously that it involves more than 10 
distinct entity types. As points of comparison, the best single system 
at the well-established single-class BioCreative 2 Gene Mention task 
achieved an F score of 87.2% under matching criteria that in some cases
accept more than one specific span as correct (Wilbur et al., 2007),
and the highest-performing system at the original BioNLP/JNLPBA
shared task, involving the detection of entities of five different types,
achieved an F score of 72.6% under the exact matching criterion
(Kim et al., 2004).

4.3 Event extraction 

The overall results for event extraction using EventMine are
presented in Table 7. The results demonstrate that the stacked
model incorporating information from the previously introduced GE
corpus outperforms a purely corpus-internal model. Although the
improvement from incorporating the independently annotated
out-of-domain data is somewhat modest, the result does indicate that the
annotation has met its aim to maintain compatibility with this key
resource for molecular-level event annotation.

Table 7. Overall event extraction results

Model            Prec     Rec      F score
Base             56.53    48.72    52.34
Stacking (GE)    56.38    50.77    53.43

Table 8. Event extraction results by category for stacked model

Category         Prec     Rec      F score
Anatomical       80.91    72.05    76.22
Molecular        68.44    75.63    71.86
General          43.87    38.99    41.29
Planned          56.68    51.96    54.22
Modification     47.95    29.92    36.85

Event categories as defined in Table 2. Modification gives performance for Negation
and Speculation detection.
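
The F scores reported in Table 7 can be related to the precision and recall columns with a short Python sketch. It assumes that "F score" denotes the balanced harmonic mean of precision and recall, which the text does not state explicitly. The function name `f_score` is introduced here for illustration only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F score (harmonic mean), on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values (in percent) as reported in Table 7.
base = f_score(56.53, 48.72)
stacked = f_score(56.38, 50.77)

print(f"Base: {base:.2f}")
print(f"Stacking (GE): {stacked:.2f}")
print(f"Difference: {stacked - base:.2f} percentage points")
```

Because the inputs are themselves rounded to two decimals, recomputed values can only be expected to agree with the reported F scores up to rounding. The sketch also makes visible that the gain of the stacked model comes from higher recall at nearly unchanged precision.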

As for entity mention detection, performance for the best model, 
at over 50% F score for event extraction, is very promising for 
a first experiment on the new task. For reference, the best results
in the recent, widely attended BioNLP Shared Task 2011 for the
same evaluation criteria were 56.0% F score for the GE task,
53.3% F score for the EPI task and 55.6% F score for the ID
task (Table 1) (Kim et al., 2011b). Reaching this general level
of performance suggests that the task is feasible for current event 
extraction technology and that the annotation consistency and the 
size of the introduced corpus are sufficient for reliable extraction. 

Table 8 gives a breakdown of the event extraction performance
by category. Interestingly, we find that events involving anatomical 
entities are more reliably extracted than those involving 
molecular-level ones, despite the model incorporating information 
from a corpus with a larger number of molecular level event 
annotations than the total number of annotations in the MLEE 
corpus. This is a very encouraging finding for event extraction 
for anatomical processes, indicating that the representation and 
extraction methods are well suited for the task. 

5 CONCLUSION 

We have presented the MLEE corpus, a resource aiming to extend the 
coverage of resources and methods for structured event extraction 
from the molecular level to encompass all levels from the subcellular 
to the organism. Experiments using state-of-the-art entity mention 
detection and event extraction methods demonstrated that the newly 
proposed extraction targets can be met with reasonable performance 
using the MLEE corpus, with approximately 80% overall F score for 
entity mention detection and over 50% F score for event extraction 
using standard evaluation criteria. 

In future work, we will focus on the extension of the annotations 
and extraction methods to improve the domain independence of 
our annotation to allow the application of the introduced extraction
methods at large scale to automatically annotate the entire available 
literature. The results of these extraction efforts will be made 
available through search systems such as MEDIE to further improve 
access to the biomedical literature by facilitating structured semantic 
queries across multiple levels of biological organization, for example 
to find statements regarding the inhibition of organ growth by 
specific molecular-level entities or events. 

All resources introduced in this study, including the annotated 
corpus, guidelines, the evaluation tools and the methods are available 
from http://nactem.ac.uk/MLEE/.